Linux Cubed Series 7: Sunsite


Text File | 1996-04-15 | 16.9 KB | 443 lines

Translation system for Linuxconf

Introduction

Linuxconf is a large software component, full of menus and dialogs. To be easily translatable, all messages must be extracted from the C++ source code and placed into dictionaries which can be translated efficiently. A special set of tools has been designed to achieve this. They are described here.

1. Introduction

This document describes both how the system works and how translators can use it. It starts by explaining how programmers can use it to produce translatable programs. The section "How to translate" explains how translators can use this system to translate Linuxconf or any program written using this system.

2. Principles

To make programs easily translatable, all messages should be placed in dictionaries. A dictionary is made of message entries. Each message has a unique ID and a value. In the C++ source, programmers refer to those messages by ID whenever they want to print or say something. Each time a programmer needs a new message, he has to add it to the message dictionary and reference it from the C++ source code.

This is how most systems work (there are other translation systems out there). The system used by Linuxconf is fundamentally different. Messages are defined in the C++ source code, and the dictionaries are built by scanning all C++ source files.

Programmers must provide an ID and a value for each message right in the source code. This is much easier (or nicer) than going back and forth between the source and the dictionary. Furthermore, the programmer sees the message definition directly in the source. With other systems, only the message ID is visible in the source.

Using the magic of the C preprocessor, the message value is not compiled into the object code at all. Seen this way, the translation system used by Linuxconf yields the same result as other systems. It is just nicer to use for programmers.
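The effect of the preprocessor trick can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code generated for Linuxconf: the names demo_table, demo_dictionary, M_ENTER, enter_message and check_enter_message are hypothetical, and the table stands in for a dictionary that would normally be loaded from a binary file at run time. The MSG_U definition follows the form shown in section 2.6.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a dictionary loaded at run time.
// In a real program the strings would come from the binary dictionary.
static const char *demo_table[] = { "Entering function foo (from dictionary)" };
static const char **demo_dictionary = demo_table;

// The macro keeps the ID and throws away the text written in the source.
#define MSG_U(id, text) id

// The ID is just an index into the dictionary (hypothetical name).
#define M_ENTER demo_dictionary[0]

// The literal is visible to the programmer and to a source scanner,
// but after preprocessing only demo_dictionary[0] remains.
const char *enter_message()
{
    return MSG_U(M_ENTER, "Entering function foo");
}

// Self-check: the value comes from the table, not from the literal.
void check_enter_message()
{
    assert(enter_message() == demo_table[0]);
    assert(std::strcmp(enter_message(),
                       "Entering function foo (from dictionary)") == 0);
}
```

Calling check_enter_message() from a main() function runs the checks. The point of the design is that the default wording stays next to the code that uses it, while the compiled program only carries an index.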
Let's describe how a programmer uses the system.

2.1. One dictionary per source directory

It is best to define one message dictionary per sub-project or sub-directory. This is easier to manage and avoids ID name space congestion. For each source directory of Linuxconf you have one "dic" file and one "m" file. Both files are produced simply by doing

     make msg

This command scans all C++ source files of the current directory and updates the file ../messages/sources/DIRECTORY.dic and the file DIRECTORY.m, where DIRECTORY is the name of the current directory.

make msg uses the ../translate/msgscan utility to scan the source. This utility looks for specific constructs in the C++ source files. Here they are.

2.2. The MSG_U macro

The MSG_U macro defines a new message. It defines both its ID and its value. This macro is usable anywhere a C++ string would be.

     #include <stdio.h>
     #include "prjfoo.m"

     int foo()
     {
             /* M_MSG1 is the ID, the string is the message value */
             printf (MSG_U(M_MSG1,"Entering function foo"));
             return 0;
     }

MSG_U defines a single value. U stands for unilingual.

2.3. The MSG_B macro

The MSG_B macro is like the MSG_U macro, except that it defines two values, allowing a programmer to code two languages at once. The B stands for bilingual. This has not been used in the Linuxconf project but has proven effective for other projects.

     #include <stdio.h>
     #include "prjfoo.m"

     int foo()
     {
             /* One ID, two values: English first, French second */
             printf (MSG_B(M_MSG1
                     ,"Entering function foo\n"
                     ,"Démarrage de la fonction foo\n"));
             return 0;
     }

2.4. The MSG_R macro

The MSG_R macro simply references an already defined message. This message may have been defined in another source file (of the same project). Like the other macros, MSG_R may be used anywhere a C++ string is.

2.5. The MSG_VERSION macro

This macro has not been used so far. It would allow a programmer to raise the version number of a dictionary, preventing older applications from using a newer, potentially incompatible dictionary. The msgclean utility also changes the version number of the dictionary.
The MSG_VERSION macro is still a concept rather than a useful addition. Stay tuned...

2.6. The magic of the MSG_ macros

The MSG_ macros perform two tasks. First, they are easily spotted by the msgscan utility. The parsing is simple and reliable even if the C++ source code is not functional. Second, they hide the retrieval mechanism (how the message value is retrieved from the binary dictionary at runtime).

The msgscan utility produces the .m file, which looks like this for the simple example above.

FILE prjfoo.m:

     extern const char **_dictionnary_prjfoo;
     #ifndef DICTIONNARY_REQUEST
             #define DICTIONNARY_REQUEST \
                     const char **_dictionnary_prjfoo;\
                     TRANSLATE_SYSTEM_REQ _dictionnary_req_prjfoo\
                             ("prjfoo",_dictionnary_prjfoo,55,1);\
                     void dummy_dict_prjfoo(){}
     #endif
     #ifndef MSG_U
             #define MSG_U(id,m)     id
             #define MSG_B(id,m,n)   id
             #define MSG_R(id)       id
     #endif
     #define M_MSG1  _dictionnary_prjfoo[0]

As you see, one global variable is created: _dictionnary_prjfoo. A special macro DICTIONNARY_REQUEST is defined. This macro should be placed in one source file of the project. It is generally placed in the file _dict.c presented later.

3. How to use it

To produce a translatable program, do the following

   · Replace all string messages with MSG_U or MSG_B macros, giving each message a unique ID.

   · Include (#include) the .m file in each source file using the MSG_x macros. This file is generally named directory.m, where directory is the name of the current directory.

   · Create a file _dict.c. The content of this file is shown below.

   · Use "make msg" to extract the messages. This produces/updates the dictionary file directory.dic and produces the include file directory.m.

   · Compile and link your program.

   · Use "make msg.eng" to produce the English binary dictionary. The file produced should be placed where your program expects it.

We will now describe further the different steps involved.

3.1. The make msg command and msgscan utility

The make msg command invokes the msgscan utility.
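As a rough picture of the kind of scanning involved, the sketch below collects MSG_U(ID,"value") constructs from a piece of source text. It is not the msgscan implementation, whose internals are not described here: the function scan_msg_u, the Message alias and the regular expression are assumptions made for illustration. The sketch ignores comments, MSG_B, and string literals split over several lines.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// One (ID, value) pair found in the source text.
using Message = std::pair<std::string, std::string>;

// Hypothetical helper: collect every MSG_U(ID,"value") construct in `source`.
// Escape sequences inside the string literal are kept as written.
std::vector<Message> scan_msg_u(const std::string &source)
{
    static const std::regex pattern(
        R"re(MSG_U\s*\(\s*([A-Za-z_]\w*)\s*,\s*"((?:[^"\\]|\\.)*)"\s*\))re");
    std::vector<Message> found;
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        found.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return found;
}

// Self-check on the example from section 2.2.
void check_scan_msg_u()
{
    const std::string source =
        "int foo() { printf (MSG_U(M_MSG1,\"Entering function foo\")); }";
    const std::vector<Message> found = scan_msg_u(source);
    assert(found.size() == 1);
    assert(found[0].first == "M_MSG1");
    assert(found[0].second == "Entering function foo");
}
```

Because the macros have a fixed, recognisable shape, a scanner of this kind does not need to understand C++. This is consistent with the remark in section 2.6 that parsing stays reliable even when the source is not functional.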
The msgscan utility scans a set of C or C++ source files, updates a dictionary file and produces one include file. Here is the command used to update the dictionary of the sub-project uucp of the Linuxconf project.

     ../translate/msgscan uucp \
             ../messages/sources/uucp.dic uucp.m EF *.c

The first argument is the name of the dictionary.

The second argument is the path of the dictionary file. As you see, dictionary files are kept in a single directory for all projects. They are seldom. This eases the work of translators.

The third argument is the path of the include file, which is produced in the current directory.

The fourth argument gives the letter tags used to identify messages defined with the macros MSG_U and MSG_B. Messages defined with MSG_U will be tagged with the letter E (English), and messages defined with MSG_B will be tagged with E for the first value and F (French) for the second.

3.2. The _dict.c file

It is good practice to place the DICTIONNARY_REQUEST macro in a file _dict.c. There is generally one such file per directory. Its content is generally:

     #include "this_directory.m"
     #include <translat.h>
     DICTIONNARY_REQUEST

At least this dependency should be placed in your makefile

     _dict.o: _dict.c this_directory.m

This ensures that each time you update your dictionary (and the m header file), _dict.c is recompiled, so that the dictionary revision and the number of messages are properly recorded. This avoids executing a program with an obsolete or incompatible binary dictionary. Given that _dict.c is small, the compilation is pretty short.

3.3. The msgcomp utility

Once you have compiled and linked your program, you must "compile" all the dictionaries used in your program into one binary dictionary. This is done by the msgcomp utility. Here is the command used when doing "make msg.eng" for the Linuxconf project. This produces the English binary dictionary.
     ../translate/msgcomp -p../messages/sources/ \
             /tmp/linuxconf-msg-1.3.eng eE \
             askrunlevel dialog dnsconf fstab \
             misc main netconf mailconf uucp userconf

This command takes all dictionaries for the sub-projects askrunlevel, dialog, dnsconf, fstab, misc, main, netconf, mailconf, uucp and userconf and produces a single binary dictionary. The -p option tells msgcomp to look for those dic files (askrunlevel.dic dialog.dic ...) in the directory ../messages/sources/. The argument /tmp/linuxconf-msg-1.3.eng is the file to produce. The argument eE instructs msgcomp to extract message values with the 'e' tag. If there is no such value for a given message, the value with the 'E' tag will be used.

3.3.1. Convention used for tags

Dictionary files contain the definitions of all messages. Each message may have different values, identified by a tag letter. When messages are extracted by msgscan, it is instructed to associate values with given tags.

By convention, we use upper case letters to identify message values extracted from the source code. Lower case letters are used by translators. We assume here that programmers are bad writers. We let them give their best shot at the messages, and we are allowed to override their work without overwriting it.

By giving precedence to 'e' tags over 'E', we are saying that the translators' work overrides the work of programmers, but we are not forcing the translators to rewrite everything.

3.4. The msgclean utility

The msgscan utility maintains the dictionary. At some point some messages may become obsolete (unused in any source file). The msgclean utility is used to remove messages without values from the dic file. For the Linuxconf project, the make target msg.clean is defined for that purpose.

Be aware that applying msgclean to a dictionary file with obsolete messages has an important side effect. Since some messages are deleted, the numbering of all following messages changes. All sources using the m include file should be recompiled.
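The renumbering side effect can be made concrete with a small model. This is a sketch under assumptions, not the msgclean code: the Dictionary alias and the helpers position_of, without and check_renumbering are hypothetical, and a dictionary is modelled simply as an ordered list of message IDs whose positions are what the m include file records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: an ordered list of message IDs. The include file
// records each ID's position at the time it was generated.
using Dictionary = std::vector<std::string>;

// Position of `id` in `dict`, or dict.size() if absent.
std::size_t position_of(const Dictionary &dict, const std::string &id)
{
    return static_cast<std::size_t>(
        std::find(dict.begin(), dict.end(), id) - dict.begin());
}

// Return a copy of `dict` without the obsolete message `id`.
Dictionary without(Dictionary dict, const std::string &id)
{
    dict.erase(std::remove(dict.begin(), dict.end(), id), dict.end());
    return dict;
}

void check_renumbering()
{
    const Dictionary before = { "M_MSG1", "M_OBSOLETE", "M_MSG2" };
    const Dictionary after = without(before, "M_OBSOLETE");

    // Code compiled against the old include file still uses index 2 ...
    const std::size_t stale_index = position_of(before, "M_MSG2");
    assert(stale_index == 2);

    // ... but after cleaning, M_MSG2 is at index 1 and index 2 is gone.
    assert(position_of(after, "M_MSG2") == 1);
    assert(stale_index >= after.size());
}
```

In this model, an object file built before the cleaning would read past the end of the table or pick up the wrong message, which is why every source including the m file has to be rebuilt.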
To avoid problems, the msgclean utility automatically increases the revision number of the dictionary. This prevents using a dictionary with an incompatible program.

4. Usage restrictions

The strategy used is mainly targeted at C++ code. With some restrictions, it may be used for C code. Here is the main feature that probably doesn't work with C.

static initialisation
     In C++ one can write the following code.

          static char *tb[]={
                  foo(1),foo(22)
          };

     where foo is a function. The C++ compiler will generate the proper code, which will probably be called once. The MSG_U macro (and the others) do not hide a function call, but they are dynamic in a sense: an ID such as M_MSG1 expands to an element of the dictionary array, whose value is only known at run time. C does not accept such an expression in a static initialiser. Other translation strategies based on dictionaries have the same limitation though.

     The example using static char *tb[] also causes a problem in C++ if the variable is declared outside of a function. The problem appears because the "hidden" initialisation code generated by the compiler is called very early, often before main() is called. Normally, the function translat_load(), which brings the dictionary into memory, is called by main(). Fortunately, in the current implementation, where _dictionnary_system is a pointer, a segmentation fault is triggered whenever this condition is met. The fault is triggered every time, because all such initialisations run before main(). The strategy is safe.

5. Recommended usage and conventions

5.1. Naming convention for message IDs

To help people who will translate Linuxconf, I have used a convention for the ID names.

     B_   Buttons.
     E_   Error messages start with this.
     F_   Field labels start with this.
     I_   Dialog introductions start with this.
     M_   All menu entries start with this prefix.
     N_   Notices and warnings start with this.
     P_   When the user is prompted for a password, the message's ID starts with this.
     Q_   Identifies a question (generally a Yes/No prompt).
     T_   Dialog titles start with this.
     X_   All other messages which fit in no category.

6. How to translate

6.1. Go simple

One way to translate is to go right into the .dic files and add translations for each message using a different tag. Then use the msgcomp utility to extract the proper definition.

At first, there is little problem doing this. The msgscan utility reads, updates and saves the .dic file, so your changes won't be lost.

The problem comes from the way software is developed. First we develop and then, when it is stable, we translate. Doing so means that we have to walk through all the .dic files to make sure our translation still fits the original messages (the English version, for example). Those original messages may have changed.

A different scheme was chosen for Linuxconf.

6.2. Organisation of the messages directory

The messages directory contains one subdirectory per language plus one sources directory. The sources directory contains all the .dic files produced from the source code. These files are never hand edited. Each other directory has a copy of those .dic files with the proper translation.

A special utility, msgupd, has been created. It basically compares all messages in the sources directory with the messages in the translated directory. It compares only one language (say the English version). Mostly, msgupd will tell you

   · Which messages are new.

   · Which messages have changed (the English wording).

Using that information, you know exactly what you have to do to keep your work in sync with the current release of Linuxconf.

msgupd reorders the translated .dic file (not the one in the sources directory) so that all messages which need work are at the beginning of the file. It also adds a comment (.dic files may have comments, like most normal Unix configuration files) explaining what has to be done.

If the English version of a message was changed, msgupd retags the old version in the translated file and adds the new version, plus a comment. The old English message gets the tag "Z", so you can easily see what changed.

6.3. The msgupd utility

The file rules.mak shows the rules for one translation (which is not done yet).
Look for the targets msg.cfr and upd.cfr. To add a new language, do this

   · Create a new empty directory in the messages directory, for example mar for an alien language.

   · Customise rules.mak and add the targets msg.mar and upd.mar.

   · Run the following command. This will fill the messages/mar directory with all the necessary .dic files.

          make upd.mar

   · Go into messages/mar, edit each .dic file and add the proper translation as needed.

   · Run the following command to produce the binary dictionary required to run Linuxconf.

          make msg.mar

   · Set the following environment variables and run Linuxconf.

       · export LINUXCONF_LANG=mar

       · export LINUXCONF_DICT=/tmp

         This variable is optional. Linuxconf normally looks for its message dictionary in /usr/lib/linuxconf. This variable overrides that location. The msg.* makefile targets generally produce their output in /tmp. This is useful to test new messages without breaking the current installation of Linuxconf.

         Be aware that this mechanism only works if you execute Linuxconf as root. For security reasons, a normal user can't override the message dictionary of Linuxconf (although he can select a different language from /usr/lib/linuxconf if available).

6.4. The msgcomp utility

The msgcomp utility has been tweaked to support the distributed directory concept. Mainly, it uses the .dic files in the sources directory as a reference. Message number IDs are defined from these files. It then uses (optionally) alternative

7. Licensing

The translate directory is part of the Linuxconf project but carries a special license. There is no restriction on usage. Feel free to incorporate this system into any project. This simple license does not apply to the rest of Linuxconf, which is covered by the standard GNU Copyleft license. See the file LICENSE in the root directory.

If you find it useful for another project, send me a note and some comments if possible.